1. INTRODUCTION

A significant amount of recent research has addressed how to
solve various data management problems in the cloud.
The major algorithmic challenges in map-reduce computations in- 
volve balancing a multitude of factors such as the number of ma- 
chines available for mappers/reducers, their memory requirements, 
and communication cost (total amount of data sent from mappers 
to reducers). Most past work provides custom solutions to specific 
problems, e.g., performing fuzzy joins in map-reduce [2, 8],
clustering [3], graph analyses [1, 6, 5], and so on. While some problems
are amenable to very efficient map-reduce algorithms, some other 
problems do not lend themselves to a natural distribution, and have 
provable lower bounds. Clearly, the ease of "map-reducibility" is
closely related to whether the problem can be partitioned into in- 
dependent pieces, which are distributed across mappers/reducers. 
What makes a problem distributable? Can we characterize general 
properties of problems that determine how easy or hard it is to find 
efficient map-reduce algorithms? 

This is a vision paper that attempts to answer the questions de- 
scribed above. We define and study replication rate. Informally, 
the replication rate of any map-reduce algorithm gives the average 
number of reducers each input is sent to. There are many ways to 
implement nontrivial problems in a round of map-reduce; the more 
parallelism you want, the more overhead you face due to having to 
replicate inputs to many reducers. In this paper: 

• We offer a simple model of how inputs and outputs are re- 
lated, enabling us to study the replication rate of problems. 
We show how our model can capture a varied set of problems
(Section 2).

• We study two interesting problems, Hamming Distance-1
(Section 3) and triangle finding (Section 4), and show in
each case there is a lower bound on the replication rate that 
grows as the number of inputs per reducer shrinks (and there- 
fore as the parallelism grows). Moreover, we present meth- 
ods of mapping inputs to reducers that meet these lower bounds 
for various values of inputs/reducer. 

It is our long-term goal to understand how the structure of a prob- 
lem, as reflected by the input-output relationship in our model, af- 
fects the degree of parallelism/replication tradeoff. 

2. THE MODEL 

The model looks simple - perhaps too simple. But with it we 
can discover some quite interesting and realistic insights into the 
range of possible map-reduce algorithms for a problem. For our 
purposes, a problem consists of:



1. Sets of inputs and outputs. 

2. A mapping from outputs to sets of inputs. The intent is that 
each output depends on only the set of inputs it is mapped to. 

Note that our model essentially captures the notion of provenance [7].
In our context, there are two nonobvious points about this model: 

• Inputs and outputs are hypothetical, in the sense that they 
are all the possible inputs or outputs that might be present 
in an instance of the problem. Any instance of the problem 
will have a subset of the inputs. We assume that an output is 
never made unless at least one of its inputs is present, and in 
many problems, we only want to make the output if all of its 
associated inputs are present. 

• We need to limit ourselves to finite sets of inputs and outputs. 
Thus, a finite domain or domains from which inputs and out- 
puts are constructed is often an integral part of the problem 
statement, and a "problem" is really a family of problems, 
one for each choice of finite domain(s). 

We hope a few examples will make these ideas clear. 

2.1 Examples of Problems 

Example 2.1. Consider the natural join of relations R(A, B)
and S(B, C). The inputs are tuples in R or S, and the outputs are
tuples with schema (A, B, C). To make this problem finite, we need
to assume finite domains for attributes A, B, and C; say there are
$N_A$, $N_B$, and $N_C$ members of these domains, respectively.

Then there are $N_A N_B N_C$ outputs, each corresponding to a triple
(a, b, c). This output is mapped to a set of two inputs. One is the
tuple R(a, b) from relation R and the other is the tuple S(b, c) from
relation S. The number of inputs is $N_A N_B + N_B N_C$.

Notice that in an instance of the join problem, not all the inputs 
will be present. That is, the relations R and S will be subsets of 
all the possible tuples, and the output will be those triples (a, b, c) 
such that both R(a, b) and S(b, c) are actually present in the input
instance. 
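
The join model above can be made concrete with a minimal sketch that enumerates the potential inputs and outputs for small domains. The function names `join_problem` and `produced`, and the encoding of inputs as tagged tuples, are assumptions made for illustration; the model itself does not prescribe them.

```python
from itertools import product

def join_problem(dom_a, dom_b, dom_c):
    """Enumerate potential inputs and outputs of R(A,B) joined with S(B,C).

    Returns (inputs, outputs), where outputs maps each triple (a, b, c)
    to the set of the two inputs it depends on.
    """
    inputs = {("R", a, b) for a, b in product(dom_a, dom_b)}
    inputs |= {("S", b, c) for b, c in product(dom_b, dom_c)}
    outputs = {
        (a, b, c): {("R", a, b), ("S", b, c)}
        for a, b, c in product(dom_a, dom_b, dom_c)
    }
    return inputs, outputs

def produced(outputs, instance):
    """Outputs of an instance: those whose associated inputs are all present."""
    return {out for out, needed in outputs.items() if needed <= instance}

# Domains of sizes 2, 3, 4: 2*3 + 3*4 = 18 inputs and 2*3*4 = 24 outputs.
inputs, outputs = join_problem(range(2), range(3), range(4))
print(len(inputs), len(outputs))

# An instance holds only some of the potential tuples.
instance = {("R", 0, 1), ("S", 1, 2), ("S", 2, 3)}
print(produced(outputs, instance))
```

Only the triple (0, 1, 2) is produced here, because it is the only output whose two associated tuples are both in the instance.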

Example 2.2. For another example, consider finding triangles. 
We are given a graph as input and want to find all triples of nodes 
such that in the graph there are edges between each pair of these 
three nodes. To model this problem, we need to assume a domain 
for the nodes of the input graph with N nodes. An output is thus a 
set of three nodes, and an input is a set of two nodes. The output 
{u, v, w} is mapped to the set of three inputs {u, v}, {u, w}, and
{v, w}. Notice that, unlike the previous and next examples, here
an output is mapped to a set of more than two inputs. In an instance
of the triangles problem, some of the possible edges will be present, and the
outputs produced will be those such that all three edges to which
the output is mapped are present. 
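
As a rough illustration of this example, the sketch below builds the potential edges and node triples for a small node domain and lists the triangles of one instance. The names `triangle_problem` and `triangles_in` are hypothetical helpers, and edges are encoded as frozensets so that {u, v} and {v, u} coincide.

```python
from itertools import combinations

def triangle_problem(n):
    """Potential inputs (edges) and outputs (node triples) on n nodes."""
    inputs = {frozenset(e) for e in combinations(range(n), 2)}
    outputs = {
        frozenset(t): {frozenset(e) for e in combinations(t, 2)}
        for t in combinations(range(n), 3)
    }
    return inputs, outputs

def triangles_in(outputs, edges):
    """Outputs whose three associated edges are all present in the instance."""
    present = {frozenset(e) for e in edges}
    return {t for t, needed in outputs.items() if needed <= present}

inputs, outputs = triangle_problem(5)
found = triangles_in(outputs, [(0, 1), (1, 2), (0, 2), (2, 3)])
print(len(inputs), len(outputs), found)
```

With 5 nodes there are 10 potential edges and 10 potential triangles, and each output is mapped to exactly three inputs.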

Example 2.3. This example is a very simple case of a similar- 
ity join. The inputs are binary strings, and since we have to make 
things finite, we shall assume that these strings have a fixed length 
b. There are thus $2^b$ inputs. The outputs are pairs of inputs that
are at Hamming distance 1; that is, the inputs differ in exactly one
bit. There are thus $(b/2)2^b$ outputs, since each of the $2^b$ inputs is
at Hamming distance 1 from exactly b other inputs: those that differ
in exactly one of the b bits. However, that observation counts every 
pair of inputs at distance 1 twice, which is why we must divide by 
2. 
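
The count of outputs can be checked by brute force for small b. This is a minimal sketch in which bit strings are represented as integers in the range 0 to $2^b - 1$; the function name `hamming1_outputs` is an assumption for illustration.

```python
def hamming1_outputs(b):
    """All unordered pairs of b-bit strings (as ints) at Hamming distance 1."""
    pairs = set()
    for w in range(2 ** b):
        for bit in range(b):
            v = w ^ (1 << bit)  # flip exactly one bit of w
            pairs.add((min(w, v), max(w, v)))  # store each pair once
    return pairs

# Each pair is found twice (once from each endpoint), hence (b/2) * 2^b pairs.
for b in range(1, 7):
    assert len(hamming1_outputs(b)) == b * 2 ** b // 2
print(len(hamming1_outputs(4)))
```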

Example 2.4. Suppose we have a relation R(A, B) and we
want to implement group-by-and-sum: 

    SELECT A, SUM(B)
    FROM R
    GROUP BY A;

We must assume finite domains for A and B. An output is a value 
of A, say a, chosen from the finite domain of A-values, together 
with the sum of all the B-values. This output is associated with a
large set of inputs: all tuples with A-value a and any B-value from 
the finite domain of B. In any instance of this problem, we do not 
expect that all these tuples will be present, but as long as at least 
one of them is present, there will be an output for this value a. 

2.2 Mapping Schemas and Replication Rate 

For many problems, there is a tradeoff between the number of re- 
ducers to which a given input must be sent and the number of inputs 
that can be sent to one reducer. It can be argued that the existence 
of such a tradeoff is tantamount to the problem being "not embar- 
rassingly parallel"; that is, the more parallelism we introduce, the 
greater will be the total cost of computation. 

The more reducers that receive a given input, the greater the com- 
munication cost for solving an instance of a problem using map- 
reduce. As communication tends to be expensive, and in fact is 
often the dominant cost, we'd like to keep the number of reducers 
per input low. However, there is also a good reason to want to keep 
the number of inputs per reducer low. Doing so makes it likely that 
we can execute the Reduce task in main memory. Also, the smaller 
the input to each reducer, the more parallelism there can be and 
the lower will be the wall-clock time for executing the map-reduce 
job (assuming there is an adequate number of compute-nodes to 
execute all the Reduce tasks in parallel). 

In our discussion, we shall use the convention that p is the num- 
ber of reducers used to solve a given problem instance, and q is 
the maximum number of inputs that can be sent to any one reducer. 
We should understand that q counts the number of potential inputs, 
regardless of which inputs are actually present for an instance of 
the problem. However, on the assumption that inputs are chosen 
independently with fixed probability, we can expect the number of 
actual inputs at a reducer to be q times that probability, and there 
is a vanishingly small chance of significant deviation for large q. 
If we know the probability of an input being present in the data is 
x, and we can tolerate $q_1$ real inputs at a reducer, then we can use
$q = q_1/x$ to account for the fact that not all inputs will actually be
present. 

With this motivation in mind, let us define a mapping schema for 
a given problem, with a given value of q, to be an assignment of a 
set of reducers to each input, subject to the constraints that: 

1. No more than q inputs are assigned to any one reducer. 



2. For every output, its associated inputs are all assigned to one 
reducer. We say the reducer covers the output. This reducer 
need not be unique, and it is, of course, permitted that these 
same inputs are assigned also to other reducers. 

The figure of merit for a mapping schema is the replication rate, 
which we define to be the average number of reducers to which 
an input is mapped by that schema. Suppose that for a certain
algorithm, the ith reducer is assigned $q_i \le q$ inputs, and let $|I|$ be
the number of different inputs. Then the replication rate r for this
algorithm is

$$r = \sum_{i=1}^{p} \frac{q_i}{|I|}$$
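
The two constraints on a mapping schema and the definition of the replication rate can be sketched directly. In this illustration a schema is assumed to be a dictionary from each input to the set of reducer identifiers it is sent to; the function names are hypothetical.

```python
def reducer_contents(assignment):
    """Invert an input -> reducers mapping into reducer -> set of inputs."""
    contents = {}
    for inp, reducers in assignment.items():
        for red in reducers:
            contents.setdefault(red, set()).add(inp)
    return contents

def is_valid_schema(assignment, outputs, q):
    """Check both constraints: reducer size at most q, every output covered."""
    contents = reducer_contents(assignment)
    if any(len(inps) > q for inps in contents.values()):
        return False
    return all(
        any(needed <= inps for inps in contents.values())
        for needed in outputs
    )

def replication_rate(assignment):
    """Average number of reducers to which an input is sent."""
    return sum(len(reds) for reds in assignment.values()) / len(assignment)

# Hamming distance 1 on 2-bit strings, with one reducer per output (q = 2).
outputs = [{0, 1}, {0, 2}, {1, 3}, {2, 3}]
assignment = {
    w: {i for i, out in enumerate(outputs) if w in out} for w in range(4)
}
print(is_valid_schema(assignment, outputs, q=2), replication_rate(assignment))
```

In this small case every input goes to two reducers, so the replication rate is 2, which equals the string length b.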

We want to derive lower bounds on r, as a function of q, for var- 
ious problems, thus demonstrating the tradeoff between high par- 
allelism (many small reducers) and overhead (total communication 
cost - the replication rate). These lower bounds depend on count- 
ing the total number of outputs that a reducer can cover if it is given 
at most q inputs. We let g(q) denote this number of outputs that a
reducer with q inputs can cover. 

Observe that, no matter what random set of inputs is present 
for an instance of the problem, the expected communication is r 
times the number of inputs actually present, so r is a good mea- 
sure of the communication cost incurred during an instance of the 
problem. Further, the assumption that the mapping schema assigns 
inputs to processors without reference to what inputs are actually 
present captures the nature of a map-reduce computation.
Normally, a map function turns input objects into key-value pairs
independently, without knowing what else is in the input.

3. THE HAMMING-DISTANCE-1 PROBLEM 

We are going to begin our development of the model with the 
tightest result we can offer. For the problem of finding pairs of bit 
strings of length b that are at Hamming distance 1, we have a lower 
bound on the replication rate r as a function of q, the maximum 
number of inputs assigned to a reducer. This bound is essentially 
best possible, as we shall point to a number of mapping schemas 
that solve the problem and have exactly the replication rate stated 
in the lower bound.

3.1 Bounding the Number of Outputs 

The key to the lower bound on replication rate as a function of q 
is a tight upper bound on the number of outputs that can be covered 
by a reducer assigned q inputs. 

Lemma 3.1. For the Hamming-distance-1 problem, a reducer 
that is assigned q inputs can cover no more than $(q/2)\log_2 q$
outputs.
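
Before the proof, the lemma can be sanity-checked by exhaustive search for very small b. The sketch below is only a brute-force check under the assumption that strings are encoded as integers; it is not part of the proof, and the names `covered`, `max_covered`, and `lemma_holds` are illustrative.

```python
import math
from itertools import combinations

def covered(X, b):
    """Number of distance-1 pairs with both strings in X (strings are ints)."""
    S = set(X)
    ordered = sum(1 for w in S for bit in range(b) if (w ^ (1 << bit)) in S)
    return ordered // 2  # every unordered pair was counted from both ends

def max_covered(b, q):
    """Largest number of outputs any set of q strings of length b can cover."""
    return max(covered(X, b) for X in combinations(range(2 ** b), q))

def lemma_holds(b):
    """Compare the exhaustive maximum with (q/2) log2 q for every q."""
    return all(
        max_covered(b, q) <= (q / 2) * math.log2(q) + 1e-9
        for q in range(1, 2 ** b + 1)
    )

print(all(lemma_holds(b) for b in (1, 2, 3)))
print(max_covered(3, 4), max_covered(3, 8))
```

The bound is met with equality when the q inputs form a subcube, for example the 4 strings of one face of the 3-dimensional cube.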

Proof. The proof is an induction on b, the length of the bit
strings in the input. The basis is b = 1. Here, there are only two
strings, so q is either 1 or 2. If q = 1, the reducer can cover no
outputs. But $(q/2)\log_2 q$ is 0 when q = 1, so the lemma holds in
this case. If q = 2, the reducer can cover at most one output. But
$(q/2)\log_2 q$ is 1 when q = 2, so again the lemma holds.

Now let us assume the bound for b and consider the case where
the inputs consist of strings of length b + 1. Let X be a set of q bit
strings of length b + 1. Let Y be the subset of X consisting of those
strings that begin with 0, and let Z be the remaining strings of X,
those that begin with 1. Suppose Y and Z have y and z members,
respectively, so q = y + z.

An important observation is that for any string in Y , there is at 
most one string in Z at Hamming distance 1. That is, if 0w is in
Y, it could be Hamming distance 1 from 1w in Z, if that string
is indeed in Z, but there is no other string in Z that could be at
Hamming distance 1 from 0w, since all strings in Z start with 1.
Likewise, each string in Z can be distance 1 from at most one string
in Y. Thus, the number of outputs with one string in Y and the
other in Z is at most min(y, z).

So let's count the maximum number of outputs that can have 
their inputs within X. By the inductive hypothesis, there are at 
most $(y/2)\log_2 y$ outputs both of whose inputs are in Y, at most
$(z/2)\log_2 z$ outputs both of whose inputs are in Z, and, by the
observation in the paragraph above, at most min(y, z) outputs with
one input in each of Y and Z. 

Assume without loss of generality that $y \le z$. Then the
maximum number of outputs that can be covered by a reducer with
q inputs of length b + 1 is

$$\frac{y}{2}\log_2 y + \frac{z}{2}\log_2 z + y$$

We must show that this function is at most $(q/2)\log_2 q$, or, since
q = y + z, we need to show

$$\frac{y}{2}\log_2 y + \frac{z}{2}\log_2 z + y \le \frac{y+z}{2}\log_2(y+z) \qquad (1)$$

under the condition that $z \ge y$.

First, observe that when y = z, Equation (1) holds with
equality. That is, both sides become $y(\log_2 y + 1)$. Next, consider the
derivatives, with respect to z, of the two sides of Equation (1). The
derivative d/dz of the left side is

$$\frac{1}{2}\log_2 z + \frac{\log_2 e}{2}$$

while the derivative of the right side is

$$\frac{1}{2}\log_2(y+z) + \frac{\log_2 e}{2}$$

Since $z \ge y > 0$, the derivative of the left side is always less than
or equal to the derivative of the right side. Thus, as z grows larger
than y, the left side remains no greater than the right. That proves
the induction step, and we may conclude the lemma. □
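
The inequality used in the induction step can also be checked numerically over a grid of integer values with y no larger than z. This is only a spot check under floating-point arithmetic (hence the small tolerance), not a substitute for the derivative argument; the helper names `lhs` and `rhs` are illustrative.

```python
import math

def lhs(y, z):
    """Left side: outputs inside Y, inside Z, and across the two halves."""
    return (y / 2) * math.log2(y) + (z / 2) * math.log2(z) + y

def rhs(y, z):
    """Right side: the claimed bound (q/2) log2 q with q = y + z."""
    return ((y + z) / 2) * math.log2(y + z)

# Largest value of lhs - rhs over 1 <= y <= z <= 200; it should not be positive.
worst = max(lhs(y, z) - rhs(y, z) for y in range(1, 201) for z in range(y, 201))
print(worst <= 1e-9)
```

The gap closes exactly when y = z, which is the case of equality noted in the proof.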

3.2 The Tradeoff for Hamming Distance 1 

We can use Lemma 3.1 to get a lower bound on the replication
rate as a function of q, the maximum number of inputs at a reducer. 

Theorem 3.2. For the Hamming distance 1 problem with
inputs of length b, the replication rate r is at least $b/\log_2 q$.

Proof. Suppose there are p reducers, each with $q_i \le q$ inputs.
By Lemma 3.1, there are at most $(q_i/2)\log_2 q_i$ outputs covered by
reducer i.

The total number of outputs, given that inputs are of length b, is
$(b/2)2^b$. Thus, since every output must be covered, and
$\log_2 q \ge \log_2 q_i$ for all i, we have

$$\sum_{i=1}^{p} \frac{q_i}{2}\log_2 q_i \ge \frac{b}{2}2^b \qquad (2)$$

$$\sum_{i=1}^{p} \frac{q_i}{2}\log_2 q \ge \frac{b}{2}2^b \qquad (3)$$

The replication rate is $r = \sum_{i=1}^{p} q_i/2^b$, that is, the sum of the
inputs at each reducer divided by the total number of inputs. We can
move factors in Equation (3) to get a lower bound on r, namely
$r = \sum_{i=1}^{p} q_i/2^b \ge b/\log_2 q$,
which is exactly the statement of the theorem. □
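
For readers who want the final manipulation spelled out, the step of "moving factors" can be written as the following chain, assuming q > 1 so that $\log_2 q$ is positive. It starts from the inequality obtained by replacing each $\log_2 q_i$ with the larger $\log_2 q$.

```latex
\sum_{i=1}^{p} \frac{q_i}{2}\log_2 q \;\ge\; \frac{b}{2}\,2^b
\quad\Longrightarrow\quad
\sum_{i=1}^{p} q_i \;\ge\; \frac{b\,2^b}{\log_2 q}
\quad\Longrightarrow\quad
r \;=\; \frac{1}{2^b}\sum_{i=1}^{p} q_i \;\ge\; \frac{b}{\log_2 q}.
```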



3.3 Upper Bound for Hamming Distance 1 

There are a number of algorithms for finding pairs at Hamming
distance 1 that match the lower bound of Theorem 3.2. First,
suppose q = 2; that is, every reducer gets exactly 2 inputs, and is
therefore responsible for exactly one output. Theorem 3.2 says the
replication rate r must be at least $b/\log_2 2 = b$. But in this case,
every input string w of length b must be sent to exactly b reducers:
one reducer for each pair consisting of w and one of the b inputs
that are at Hamming distance 1 from w.

There is another simple case at the other extreme. If $q = 2^b$, then
we need only one reducer, which gets all the inputs. In that case,
r = 1. But Theorem 3.2 says that r must be at least $b/\log_2(2^b) = 1$.



In [2], there is an algorithm called Splitting that, for the case of
Hamming distance 1, uses $2^{1+b/2}$ reducers, for some even b. Half
of these reducers, or $2^{b/2}$ reducers, correspond to the $2^{b/2}$ possible
bit strings that may be the first half of an input string. Call these
Group I reducers. The second half of the reducers correspond to the
$2^{b/2}$ bit strings that may be the second half of an input. Call these
Group II reducers. Thus, each bit string of length b/2 corresponds
to two different reducers.

An input w of length b is sent to 2 reducers: the Group-I reducer
that corresponds to its first b/2 bits, and the Group-II reducer that
corresponds to its last b/2 bits. Thus, each input is assigned to
two reducers, and the replication rate is 2. Each reducer receives
$q = 2^{b/2}$ inputs, so that also matches the
lower bound of $b/\log_2(2^{b/2}) = b/(b/2) = 2$. It is easy to observe
that every pair of inputs at distance 1 is sent to some reducer in 
common. These inputs must either agree in the first half of their 
bits, in which case they are sent to the same Group-I reducer, or 
they agree on the last half of their bits, in which case they are sent 
to the same Group-II reducer. 
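
The Splitting mapping can be sketched and checked exhaustively for a small even b. This is an illustrative reading of the description above rather than the implementation from [2]: strings are assumed to be integers, the "first half" is taken to be the high-order b/2 bits, and the names `splitting_reducers` and `analyze_splitting` are hypothetical.

```python
from collections import defaultdict

def splitting_reducers(w, b):
    """Reducers for string w: one keyed by its first half, one by its second."""
    half = b // 2
    first = w >> half                 # high-order b/2 bits
    second = w & ((1 << half) - 1)    # low-order b/2 bits
    return {("I", first), ("II", second)}

def analyze_splitting(b):
    """Return (number of reducers, max reducer size, replication rate, covered)."""
    loads = defaultdict(int)
    for w in range(2 ** b):
        for red in splitting_reducers(w, b):
            loads[red] += 1
    # Every pair at distance 1 must meet at some common reducer.
    covered = all(
        splitting_reducers(w, b) & splitting_reducers(w ^ (1 << bit), b)
        for w in range(2 ** b)
        for bit in range(b)
    )
    rate = sum(loads.values()) / 2 ** b
    return len(loads), max(loads.values()), rate, covered

print(analyze_splitting(6))
```

For b = 6 this reports 16 reducers of 8 inputs each, a replication rate of 2, and full coverage of the distance-1 pairs.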

We can generalize the Splitting Algorithm to give us an
algorithm whose replication rate r matches the lower bound, for any
integer $r \ge 2$. We must assume that r divides b evenly. Thus,
strings of length b can be split into r pieces, each of length b/r.
We will have r groups of reducers, numbered 1 through r. In each
group of reducers there is a reducer corresponding to each of the
$2^{b-b/r}$ bit strings of length b - b/r.

To see how inputs are assigned to reducers, suppose w is a bit
string of length b. Write $w = w_1 w_2 \cdots w_r$, where each $w_i$ is of
length b/r. We send w to the group-i reducer that corresponds to
bit string $w_1 \cdots w_{i-1} w_{i+1} \cdots w_r$, that is, w with the ith substring
$w_i$ removed. Thus, each input is sent to r reducers, one in each of
the r groups, and the replication rate is r. The input size for each
reducer is $q = 2^{b/r}$, so the lower bound says that the replication
rate must be at least $b/\log_2(2^{b/r}) = b/(b/r) = r$. That is, the
replication rate of our generalization of the Splitting algorithm is
tight.

Finally, we need to argue that the mapping schema solves the 
problem. Any two strings at Hamming distance 1 will disagree in 
only one of the r segments of length b/r. If they disagree in the 
ith segment, then they will be sent to the same Group i reducer,
because reducers in this group ignore the value in the ith segment. 
Thus, this reducer will cover the output consisting of this pair. 
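
The generalized scheme can be exercised in the same way. In this sketch strings are assumed to be Python strings of '0' and '1' characters so that segments can be sliced out directly, groups are numbered from 0 rather than 1, and the function names are illustrative.

```python
import math

def generalized_splitting(w, r):
    """Reducers for w: for each group i, w with its ith segment removed."""
    seg = len(w) // r
    return {(i, w[: i * seg] + w[(i + 1) * seg :]) for i in range(r)}

def flip(w, pos):
    """Return w with the bit at position pos inverted."""
    return w[:pos] + ("1" if w[pos] == "0" else "0") + w[pos + 1 :]

def analyze(b, r):
    """Return (q, replication rate, lower bound b / log2 q, covered)."""
    strings = [format(x, f"0{b}b") for x in range(2 ** b)]
    loads = {}
    for w in strings:
        for red in generalized_splitting(w, r):
            loads[red] = loads.get(red, 0) + 1
    q = max(loads.values())
    rate = sum(loads.values()) / len(strings)
    covered = all(
        generalized_splitting(w, r) & generalized_splitting(flip(w, pos), r)
        for w in strings
        for pos in range(b)
    )
    return q, rate, b / math.log2(q), covered

print(analyze(6, 3))
```

With b = 6 and r = 3, each reducer holds 4 inputs, and the measured replication rate coincides with the bound $b/\log_2 q = 3$.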

Figure 1 illustrates what we know. The hyperbola is the lower
bound. Known algorithms that match the lower bound on replica- 
tion rate are shown with dots. 
Figure 1: Known algorithms matching the lower bound on
replication rate

Figure 2: Partitioning by weight (axes: left-half weight and
right-half weight). Only the border weights need to be replicated



3.4 An Algorithm for Large q 

There is a family of algorithms that use reducers with large
input: q well above $2^{b/2}$, but lower than $2^b$. The simplest version
of these algorithms divides bit strings of length b into left and right
halves of length b/2 and organizes them by weights, as suggested
by Fig. 2. The weight of a bit string is the number of 1's in that
string. In detail, for some k, which we assume divides b/2, we
partition the weights into b/(2k) groups, each with k consecutive
weights. Thus, the first group is weights 0 through k - 1, the
second is weights k through 2k - 1, and so on. The last group has an
extra weight, b/2, and consists of weights b/2 - k through b/2.

There are $(b/(2k))^2$ reducers; each corresponds to a range of weights
for the first half and a range of weights for the second half. A string
is assigned to reducer (i, j), for i, j = 1, 2, ..., b/(2k), if the left half
of the string has weight in the range (i - 1)k through ik - 1 and
the right half of the string has weight in the range (j - 1)k through
jk - 1.

Consider two bit strings $w_0$ and $w_1$ of length b that differ in
exactly one bit. Suppose the bit in which they differ is in the left
half, and suppose that $w_1$ has a 1 in that bit. Finally, let $w_1$ be
assigned to reducer R. Then unless the weight of the left half of $w_1$
is the lowest weight for the left half that is assigned to reducer R,
$w_0$ will also be at R, and therefore R will cover the pair $\{w_0, w_1\}$.
However, if the weight of $w_1$ in its left half is the lowest
possible left-half weight for R, then $w_0$ will be assigned to the reducer
with the same range for the right half, but the next lower range
for the left half. Therefore, to make sure that $w_0$ and $w_1$ share a
reducer, we need to replicate $w_1$ at the neighboring reducer that
handles $w_0$. The same problem occurs if $w_0$ and $w_1$ differ in the
right half, so any string whose right half has the lowest possible
weight in its range also has to be replicated at a neighboring
reducer. We suggested in Fig. 2 how the strings with weights at the
two lower borders of the ranges for a reducer need to be replicated
at a neighboring reducer.
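
The weight-based assignment and its border replication can be sketched as follows. This is a minimal illustration under several assumptions: strings are integers, the left half is the high-order b/2 bits, reducers are indexed from 0 rather than 1, and the names `weight_reducers` and `analyze_weights` are hypothetical.

```python
def weight_reducers(w, b, k):
    """Home reducer of w plus the neighbors it must be replicated to."""
    half = b // 2
    groups = half // k
    left = bin(w >> half).count("1")                 # weight of left half
    right = bin(w & ((1 << half) - 1)).count("1")    # weight of right half
    # The last group also absorbs the extra weight b/2.
    i = min(left // k, groups - 1)
    j = min(right // k, groups - 1)
    targets = {(i, j)}
    if i > 0 and left == i * k:    # lowest left weight of its range
        targets.add((i - 1, j))
    if j > 0 and right == j * k:   # lowest right weight of its range
        targets.add((i, j - 1))
    return targets

def analyze_weights(b, k):
    """Return (max reducer size, replication rate, covered)."""
    loads, total = {}, 0
    for w in range(2 ** b):
        targets = weight_reducers(w, b, k)
        total += len(targets)
        for t in targets:
            loads[t] = loads.get(t, 0) + 1
    covered = all(
        weight_reducers(w, b, k) & weight_reducers(w ^ (1 << bit), b, k)
        for w in range(2 ** b)
        for bit in range(b)
    )
    return max(loads.values()), total / 2 ** b, covered

print(analyze_weights(8, 2))
```

For b = 8 and k = 2 there are four reducers, every distance-1 pair shares a reducer, and the measured replication rate is 1.75. At such a small b the border weights are not yet an even 1/k share of the strings, so the rate differs from the asymptotic estimate.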

Now, let us analyze the situation, including the maximum
number q of inputs assigned to a reducer, and the replication rate. For
the bound on q, note that the vast majority of the bit strings of
length n have weight close to n/2. The number of bit strings of
weight exactly n/2 is $\binom{n}{n/2}$. Stirling's approximation gives us
approximately $2^n/\sqrt{\pi n/2}$ for this quantity. That is, one in $O(\sqrt{n})$ of the strings
have the average weight.
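
The quality of this estimate is easy to check numerically. The sketch below assumes the standard Stirling form for the central binomial coefficient, $2^n/\sqrt{\pi n/2}$ for even n, and compares it with the exact count; the function name `central_ratio` is illustrative.

```python
from math import comb, pi, sqrt

def central_ratio(n):
    """Ratio of C(n, n/2) to the estimate 2^n / sqrt(pi * n / 2), n even."""
    return comb(n, n // 2) / (2 ** n / sqrt(pi * n / 2))

# The ratio approaches 1 as n grows, so about one string in
# sqrt(pi * n / 2) has weight exactly n/2.
for n in (10, 100, 1000):
    print(n, round(central_ratio(n), 4))
```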

If we partition strings as suggested by Fig. 2, then the most
populous k × k cell, the one that contains strings with weight b/4 in
the first half and also weight b/4 in the second half, will have no
more than $k^2\binom{b/2}{b/4}^2$, which is $k^2 2^b/b$ up to a constant factor,
strings assigned. If k is a constant, then in terms of the horizontal
axis in Fig. 1, this algorithm has $\log_2 q$ equal to $b - \log_2 b$ plus
or minus a constant. It is thus very close to the right end, but not
exactly at the right end.

For the replication rate of the algorithm, if k is a constant, then
for any cell there is only a small ratio of variation between the
numbers of strings with weights i and j in the left and right halves, for
any i and j that are assigned to that cell. Moreover, when we look
at the total number of strings in the borders of all the cells, the
differences average out, so the number of replicated copies, as a
fraction of all strings, is very close to $(2k)/k^2 = 2/k$. That is, a string
is replicated if either
its left half has a weight divisible by k or its right half does. Note
that strings in the lower-left corner of a cell are replicated twice,
strings of the other 2k - 2 points on the border are replicated once,
and the majority of strings are not replicated at all. We conclude
that the replication rate is 1 + 2/k.

3.5 Generalization to d Dimensions 



The algorithm of Section 3.4 can be generalized from 2
dimensions to d. Break bit strings of length b into d pieces of length
b/d, where we assume d divides b. Each string of length b can thus
be assigned to a cell in a d-dimensional hypercube, based on the
weights of each of its d pieces. Assume that each cell has side k in
each dimension, where k is a constant that divides b/d.

The most populous cell will be the one that contains strings
where each of its d pieces has weight b/(2d). Again using
Stirling's approximation, $\binom{n}{n/2} \approx 2^n/\sqrt{\pi n/2}$ with n = b/d,
the number of strings assigned to this cell is approximately

$$\left(\frac{k\,2^{b/d}}{\sqrt{\pi b/(2d)}}\right)^{d} = \frac{k^d\,2^b}{b^{d/2}\,(\pi/(2d))^{d/2}}$$

On the assumption that k is constant, the value of $\log_2 q$ is

$$b - (d/2)\log_2 b$$

plus or minus a constant.

Footnote to Section 3.4: Note that many of the cells have many fewer
strings assigned, and in fact a fraction close to 1 of the strings have
weights within $\sqrt{b}$ of b/4 in both their left halves and right halves.
In a realistic implementation, we would probably want to combine the
cells with relatively small population at a single reducer, in order to
equalize the work at each reducer.

To compute the replication rate, observe that every point on each
of the d faces of the hypercube that are at the low ends of their
dimension must be replicated. The number of points on one face is
$k^{d-1}$, so the sum of the volumes of the faces is $dk^{d-1}$. The entire
volume of a cell is $k^d$, so the fraction of points that are replicated is
d/k, and the replication rate is 1 + d/k. Technically, we must prove
that the points on the border of a cell have, on average, the same
number of strings as other points in the cell. As in Section 3.4,
the border points in any dimension are those whose corresponding
substring has a weight divisible by k. As long as k is much smaller
than b/d, this number is close to 1/k of all the strings of that
length.
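
The claim that border weights account for about 1/k of the strings can be checked exactly with binomial coefficients. The sketch below assumes the replication rule of Section 3.4 applied independently in each of the d dimensions, so the expected number of extra copies is d times the probability that one piece has a border weight. The function name `weight_replication_rate` is illustrative.

```python
from math import comb

def weight_replication_rate(b, d, k):
    """Exact replication rate of the d-dimensional weight scheme.

    A piece of length n = b/d triggers one extra copy when its weight is the
    lowest of a group other than the first, i.e. one of k, 2k, ..., n - k.
    """
    n = b // d
    groups = n // k
    border = sum(comb(n, i * k) for i in range(1, groups)) / 2 ** n
    return 1 + d * border

print(weight_replication_rate(8, 2, 2))               # tiny case, far from the limit
print(round(weight_replication_rate(2000, 2, 4), 6))  # compare with 1 + 2/4
print(round(weight_replication_rate(3000, 3, 5), 6))  # compare with 1 + 3/5
```

For pieces of length 1000 the exact rate agrees with 1 + d/k to many decimal places, while for very short pieces it can deviate noticeably.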

4. TRIANGLE FINDING 

In this section, we present a brief description of other results ob- 
tained using our framework, specifically on finding triangles. The 
pattern that lets us investigate any problem is, we hope, clear from 
the analysis of Section 3:

1. Find an upper bound, g(q), on the number of outputs a
reducer can cover if q is the number of inputs it is given.

2. Count the total numbers of inputs |I| and outputs |O|.

3. Assume there are p reducers, each receiving $q_i \le q$ inputs
and covering $g(q_i)$ outputs. Together they cover all the
outputs. That is, $\sum_{i=1}^{p} g(q_i) \ge |O|$.

4. Manipulate the inequality from (3) to get a lower bound on
the replication rate, which is $\sum_{i=1}^{p} q_i/|I|$.

5. Hopefully, demonstrate that there are algorithms whose
replication rate matches the formula from (4).

4.1 The Tradeoff 

We shall briefly show how this method applies to the problem 
of finding triangles introduced in Example 2.2. Suppose n is the
number of nodes of the input graph. Following the outline just 
given: 

1. We claim that the largest number of outputs (triangles) a
reducer with at most q inputs can cover occurs when the reducer is
assigned all the edges running between some set of k nodes.
This point was proved, to within an order of magnitude, in
[5]. Suppose we assign to a reducer all the edges between
a set of k nodes. Then there are $\binom{k}{2}$ edges assigned to this
reducer, or approximately $k^2/2$ edges. Since this quantity
is q, we have $k = \sqrt{2q}$. The number of triangles among k
nodes is $\binom{k}{3}$, or approximately $k^3/6$ outputs. In terms of q,
the upper bound on the number of outputs is $\frac{\sqrt{2}}{3}q^{3/2}$.

2. The number of inputs is $\binom{n}{2}$, or approximately $n^2/2$. The
number of outputs is $\binom{n}{3}$, or approximately $n^3/6$.

3. So using the formulas from (1) and (2), if there are p reducers
each with $q_i \le q$ inputs: $\sum_{i=1}^{p} \frac{\sqrt{2}}{3}q_i^{3/2} \ge n^3/6$, which implies
that $\sum_{i=1}^{p} \frac{\sqrt{2}}{3}q_i\sqrt{q} \ge n^3/6$.

4. The replication rate is $\sum_{i=1}^{p} q_i$ divided by the number of
inputs, which is $n^2/2$ from (2). We can manipulate the
inequality from (3) to get $r \ge n/\sqrt{2q}$.

5. There are known algorithms that, to within a constant factor,
match the lower bound on replication rate. See [6] and [1].
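
To see the tradeoff on a concrete schema, the sketch below uses a simple node-partitioning scheme and compares its replication rate with the bound $n/\sqrt{2q}$. This scheme is an assumption chosen for illustration and is not claimed to be the algorithm of [6] or [1]: nodes are split into m groups, there is one reducer per set of three distinct groups, and an edge goes to every reducer whose groups contain both endpoints. All function names are hypothetical.

```python
from itertools import combinations
from math import sqrt

def group_of(u, m):
    """Assign node u to one of m groups (round-robin, for illustration)."""
    return u % m

def group_triple_schema(n, m):
    """Map each potential edge to the group triples containing its endpoints."""
    triples = list(combinations(range(m), 3))
    assignment = {}
    for u, v in combinations(range(n), 2):
        needed = {group_of(u, m), group_of(v, m)}
        assignment[(u, v)] = {t for t in triples if needed <= set(t)}
    return assignment

def analyze_triangles(n, m):
    """Return (q, replication rate, lower bound n / sqrt(2q), covered)."""
    assignment = group_triple_schema(n, m)
    loads = {}
    for reducers in assignment.values():
        for t in reducers:
            loads[t] = loads.get(t, 0) + 1
    q = max(loads.values())
    rate = sum(len(r) for r in assignment.values()) / len(assignment)
    # A triangle is covered if its three edges meet at a common reducer.
    covered = all(
        assignment[(a, b)] & assignment[(a, c)] & assignment[(b, c)]
        for a, b, c in combinations(range(n), 3)
    )
    return q, rate, n / sqrt(2 * q), covered

print(analyze_triangles(12, 4))
```

With 12 nodes and 4 groups each reducer receives 36 potential edges, and the measured replication rate stays above the bound while remaining within a small constant factor of it.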

Generalizing to multiway joins. Finding triangles is equivalent
to computing the multiway join $E(A, B) \bowtie E(B, C) \bowtie E(C, A)$.
Similar techniques can be used to compute lower and upper bounds for
any multiway join. In particular, in the case where we have one
relation of arity a and the multiway join uses m variables, we
get lower and upper bounds that are both $O(q^{1-m/a}n^{m-a})$.

5. SUMMARY 

This abstract introduced a simple model for defining map-reduce 
problems, enabling us to study their "distributability" properties. 
We studied the notion of replication rate, which is closely related 
to communication cost and the number of machines available for
mappers and reducers. We showed that our model effectively
captures a multitude of map-reduce problems, and is a natural
formalism for the study of replication rate. We presented a detailed
treatment of the Hamming-distance-1 problem, providing tight bounds
on the replication rate. We also presented a summary of some other 
results on multiway joins and triangle finding we have obtained. 

We believe that our formalism presents a new direction for the 
study of a large class of map-reduce problems, and allows us to 
prove results on the limits of map-reducibility for any algorithm 
for a problem that fits our model. 
Footnote to item 5 of Section 4.1: It is a little tricky to relate these algorithms to the bound, since
those algorithms assume the actual data graphs are sparse and cal- 
culate replication and input sizes in terms of the number of edges 
rather than nodes. However, on randomly chosen subsets of all 
possible edges, they do get us within a constant factor of the lower 
bound. 